CONE: Metrics for Automatic Evaluation of Named Entity Co-Reference Resolution

Authors

  • Bo Lin
  • Rushin Shah
  • Robert E. Frederking
  • Anatole Gershman
Abstract

Human annotation for Co-reference Resolution (CRR) is labor-intensive and costly, and only a handful of annotated corpora are currently available. However, corpora with Named Entity (NE) annotations are widely available. Also, unlike current CRR systems, state-of-the-art NER systems have very high accuracy and can generate NE labels that are very close to the gold standard for unlabeled corpora. We propose a new set of metrics collectively called CONE for Named Entity Coreference Resolution (NE-CRR) that use a subset of gold standard annotations, with the advantage that this subset can be easily approximated using NE labels when gold standard CRR annotations are absent. We define CONE B³ and CONE CEAF metrics based on the traditional B³ and CEAF metrics and show that CONE B³ and CONE CEAF scores of any CRR system on any dataset are highly correlated with its B³ and CEAF scores respectively. We obtain correlation factors greater than 0.6 for all CRR systems across all datasets, and a best-case correlation factor of 0.8. We also present a baseline method to estimate the gold standard required by CONE metrics, and show that CONE B³ and CONE CEAF scores using this estimated gold standard are also correlated with B³ and CEAF scores respectively. We thus demonstrate the suitability of CONE B³ and CONE CEAF for automatic evaluation
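As a rough illustration of the idea in the abstract, the sketch below computes the standard B³ precision, recall, and F1 over mention clusters, once on all mentions and once on a named-entity subset only. The `restrict` helper, the toy mentions, and the choice to simply drop non-NE mentions before scoring are assumptions made for illustration; the abstract does not give the exact CONE definition, so this should not be read as the authors' method. The sketch also assumes that gold and system clusterings cover the same mentions.

```python
from typing import List, Set, Tuple

Clusters = List[Set[str]]


def restrict(clusters: Clusters, keep: Set[str]) -> Clusters:
    """Hypothetical helper: drop mentions outside `keep`, then drop empty clusters."""
    return [c & keep for c in clusters if c & keep]


def b_cubed(gold: Clusters, system: Clusters) -> Tuple[float, float, float]:
    """Standard B-cubed precision, recall and F1, averaged over mentions."""
    gold_of = {m: c for c in gold for m in c}
    sys_of = {m: c for c in system for m in c}
    if set(gold_of) != set(sys_of):
        # Assumption of this sketch: both sides cluster the same mention set.
        raise ValueError("gold and system must cover the same mentions")
    if not gold_of:
        return 0.0, 0.0, 0.0
    precision = recall = 0.0
    for mention, gold_cluster in gold_of.items():
        sys_cluster = sys_of[mention]
        overlap = len(gold_cluster & sys_cluster)  # at least 1: the mention itself
        precision += overlap / len(sys_cluster)
        recall += overlap / len(gold_cluster)
    n = len(gold_of)
    precision, recall = precision / n, recall / n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data (invented for this example).
gold = [{"Obama", "he", "Barack Obama"}, {"Biden", "him"}]
system = [{"Obama", "he"}, {"Barack Obama", "Biden", "him"}]
named = {"Obama", "Barack Obama", "Biden"}  # mentions an NER system would label

full = b_cubed(gold, system)
ne_only = b_cubed(restrict(gold, named), restrict(system, named))
print("B3 on all mentions:", full)
print("B3 on NE mentions only:", ne_only)
```

Scoring many documents or systems this way and correlating the two sets of scores would mirror the kind of comparison the abstract reports, though the numbers from this toy example carry no such meaning.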


Similar resources

The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...


Corpus based coreference resolution for Farsi text

"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...


Text Segmentation using Named Entity Recognition and Co-reference Resolution

In this paper we examine the benefit of performing named entity recognition (NER) and co-reference resolution on an English and a Greek corpus used for text segmentation. The aim here is to examine whether the combination of text segmentation and information extraction can be beneficial for the identification of the various topics that appear in a document. NER was performed manually in the Eng...


Using cohesive properties of text for Automatic Summarization

A system allowing extractive automatic summarization of textual documents is presented. The system is based on the cohesive properties of text, namely lexical chains, co-reference chains and named entity chains. In this way the system extends the well-known lexical chaining paradigm for summarization. The system has been applied to summarization tasks on Spanish agency news. Results of its evalua...


Application of association rules mining to Named Entity Recognition and co-reference resolution for the Indonesian language

In this paper, we propose a new method, association rules mining for Named Entity Recognition (NER) and co-reference resolution. The method uses several morphological and lexical features such as Pronoun Class (PC) and Name Class (NC), String Similarity (SP) and Position (P) in the text, into a vector of attributes. Applied to a corpus of newspaper in the Indonesian language, the method outperf...




Publication date: 2010